SEQUENCE LISTING



(1) GENERAL INFORMATION: 






(i) APPLICANT: 

(A) NAME: Hoechst Aktiengesellschaft

(B) STREET: - 

(C) CITY: Frankfurt 

(D) STATE: - 

(E) COUNTRY: Germany 

(F) POSTAL CODE (ZIP) : 65926 



(G) TELEPHONE: 069-305-7072

(H) TELEFAX: 069-35-7175 

(I) TELEX: - 






(ii) TITLE OF INVENTION: Purification of higher order transcription
complexes from transgenic non-human animals

(iii) NUMBER OF SEQUENCES: 17 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: peptide 




(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..12



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Val
1 5 10






(2) INFORMATION FOR SEQ ID NO: 2:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 



(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide




(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..11



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1 5 10






(2) INFORMATION FOR SEQ ID NO: 3:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 



(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1 5
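
The declared LENGTH fields of the short peptide entries can be cross-checked against the residues actually listed. The following is a minimal sketch, not part of the filed listing: the helper names `to_one_letter` and `check_peptides` and the `PEPTIDES` table are hypothetical, and the residues are copied from SEQ ID NO: 2 to 4 as printed here. It converts the three-letter codes to the standard one-letter codes and compares the residue count with the declared length.

```python
# Illustrative helper, not part of the filed listing: convert three-letter
# residue codes to one-letter codes and check the declared LENGTH field.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Map a whitespace-separated run of three-letter codes to one-letter codes."""
    return "".join(THREE_TO_ONE[code] for code in residues.split())


# Residues and declared lengths as printed for SEQ ID NO: 2, 3 and 4.
PEPTIDES = {
    2: ("Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala", 11),
    3: ("Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala", 10),
    4: ("Tyr Pro Tyr Asp Val Pro Asp Tyr Ala", 9),
}


def check_peptides():
    """Return {seq_id: (one_letter_string, length_matches_declared)}."""
    results = {}
    for seq_id, (residues, declared) in PEPTIDES.items():
        one_letter = to_one_letter(residues)
        results[seq_id] = (one_letter, len(one_letter) == declared)
    return results


if __name__ == "__main__":
    for seq_id, (one_letter, ok) in check_peptides().items():
        print(f"SEQ ID NO: {seq_id}: {one_letter} length_ok={ok}")
```

An unknown or OCR-damaged code (for example "Gin" in place of "Gln") raises a KeyError in this sketch, which makes such damage easy to spot.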



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GGAGCAACCG CCTGCTGGGT GC 22



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CCTGTGTTGC CTGCTGGGAC G 21



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..21



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GGAGACTGAA GTTAGGCCAG C 21 

(2) INFORMATION FOR SEQ ID NO: 8: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 76 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..76

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCGGCACCAG GCCGCTGCTG TGATGATGAT GATGATGGCT GCTGCCCATG ACTGCGTAAT 60 
GCGGTCATGA CGCTTT 76 



(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 75 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..75



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GAAGGGGGTG GGGGAGGCAA GGGTACATGA GAGCCATTAC GTCGTCTTCC TGAATCCCTT 60
TAGCCGCTTT GCTCG 75

(2) INFORMATION FOR SEQ ID NO: 10: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..22

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CCCTATGACG TCCCGGATTA CG 22

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: exon

(B) LOCATION: 1..22 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GTGGAGTGGT GCCCGGCAAG GG 22 
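
Each nucleotide entry is printed in groups of ten bases with a running count at the end of every line, so the count can be verified mechanically. The sketch below is illustrative only and not part of the filed listing; the function name `join_listing` is hypothetical, and the example lines are copied from SEQ ID NO: 7 and SEQ ID NO: 8 as printed above.

```python
import re


def join_listing(lines):
    """Concatenate the base groups of a printed sequence and check running counts.

    Each line is expected to look like 'GGAGACTGAA GTTAGGCCAG C 21': groups of
    bases followed by the cumulative number of bases up to the end of the line.
    """
    bases = ""
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines between printed rows
        count = None
        if tokens[-1].isdigit():
            count = int(tokens.pop())  # trailing running count
        bases += "".join(tokens).upper()
        if count is not None and count != len(bases):
            raise ValueError(
                f"running count {count} does not match {len(bases)} bases read"
            )
    if not re.fullmatch(r"[ACGT]*", bases):
        raise ValueError("unexpected character in nucleotide sequence")
    return bases


# Lines as printed for SEQ ID NO: 7 and SEQ ID NO: 8.
SEQ_7 = ["GGAGACTGAA GTTAGGCCAG C 21"]
SEQ_8 = [
    "GCGGCACCAG GCCGCTGCTG TGATGATGAT GATGATGGCT GCTGCCCATG ACTGCGTAAT 60",
    "GCGGTCATGA CGCTTT 76",
]

if __name__ == "__main__":
    print(len(join_listing(SEQ_7)), len(join_listing(SEQ_8)))
```

A line whose running count disagrees with the bases read so far raises a ValueError, which flags dropped or duplicated groups in a transcribed listing.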

(2) INFORMATION FOR SEQ ID NO: 12: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






(ii) MOLECULE TYPE: peptide 



(ix) FEATURE:

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..19 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

Arg Gly Cys



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1310 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 






(ix) FEATURE: 

(A) NAME/KEY: exon

(B) LOCATION: 1..1310 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCATGGGCTA TCCCTATGAC GTCCCGGATT ACGCAGTCAT GGGCAGCAGC CATCATCATC 60

ATCATCACAG CAGCGGCCTG GTGCCGCGCG GCAGCCATAT GGATCAGAAC AACAGCCTGC 120

CACCTTACGC TCAGGGCTTG GCCTCCCCTC AGGGTGCCAT GACTCCCGGA ATCCCTATCT 180

TTAGTCCAAT GATGCCTTAT GGCACTGGAC TGACCCCACA GCCTATTCAG AACACCAATA 240 


GTCTGTCTAT TTTGGAAGAG CAACAAAGGC AGCAGCAGCA ACAACAACAG CAGCAGCAGC 300 

AGCAGCAGCA GCAGCAACAG CAACAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC 360 

AGCAGCAGCA GCAGCAGCAA CAGGCAGTGG CAGCTGCAGC CGTTCAGCAG TCAACGTCCC 420

AGCAGGCAAC ACAGGGAACC TCAGGCCAGG CACCACAGCT CTTCCACTCA CAGACTCTCA 480 

CAACTGCACC CTTGCCGGGC ACCACTCCAC TGTATCCCTC CCCCATGACT CCCATGACCC 540 


CCATCACTCC TGCCACGCCA GCTTCGGAGA GTTCTGGGAT TGTACCGCAG CTGCAAAATA 600 

TTGTATCCAC AGTGAATCTT GGTTGTAAAC TTGACCTAAA GACCATTGCA CTTCGTGCCC 660 

GAAACGCCGA ATATAATCCC AAGCGGTTTG CTGCGGTAAT CATGAGGATA AGAGAGCCAC 720

GAACCACGGC ACTGATTTTC AGTTCTGGGA AAATGGTGTG CACAGGAGCC AAGAGTGAAG 780

AACAGTCCAG ACTGGCAGCA AGAAAATATG CTAGAGTTGT ACAGAAGTTG GGTTTTCCAG 840

CTAAGTTCTT GGACTTCAAG ATTCAGAACA TGGTGGGGAG CTGTGATGTG AAGTTTCCTA 900 

TAAGGTTAGA AGGCCTTGTG CTCACCCACC AACAATTTAG TAGTTATGAG CCAGAGTTAT 960 

TTCCTGGTTT AATCTACAGA ATGATCAAAC CCAGAATTGT TCTCCTTATT TTTGTTTCTG 1020 

GAAAAGTTGT ATTAACAGGT GCTAAAGTCA GAGCAGAAAT TTATGAAGCA TTTGAAAACA 1080 

TCTACCCTAT TCTAAAGGGA TTCAGGAAGA CGACGTAATG GCTCTCATGT ACCCTTGCCT 1140 

CCCCCACCCC CTTCTTTTTT TTTTTTTAAA CAAATCAGTT TGTTTTGGTA CCTTTAAATG 1200

GTGGTGTTGT GAGAAGATGG ATGTTGAGTT GCAGGGTGTG GCACCAGGTG ATGCCCTTCT 1260 

GTAAGTGCCC CTTCCGGCAT CCCGGAATTC CTGCAGCCCA ACGCGGCCGC 1310 






(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4286 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..4286



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

GAATTCCCCT GCAGGTCACT TAGCGTTGGC CACATAGTAG GTTCTCAAAT ACTTGTTAAT 60

AAATAAGTTT GTTCGAGAAG CTGGGCAATG ATATTCTACA GCTGGAAGAA GAAACATAAT 120 

GATCTAGTAA TTAGCTCAAT TAAAAATAAA CGTTCTTCTT TCCTCAGAGG AGCATTTCCC 180 

AAGGCCTGCC TTGATAGCCA TCCAAAAAGG CCAAGCTCAT CCAATCTTGC CCTAGATTTA 240

TGCTAAAATG CAGTTACAAT CGATAGGATG ACAGAAAACG ACAGCACTTA TTTAAATATA 300 

ATAGGCACTT ATTTAAATAG GAGAAGCTGT GACTTCATAG CAAGTGTTGG GGTTAGGAAA 360

CTGGGTGGAT AAACTTGCTG ATGCTGTAGA TCTTAGCCTC TACATGAGAT CATGTGGAAA 420 

ATCTGAAAGC ATTTTAGGTT CCTTATGTTT GCAATCAAAT AACTGTACAC CTTTTAATTT 480 



AAAAAGTACC ATGAGGCACA CACACACACT CGCAGGAACT TTTTGGCGTA ACAAAACTAG 540 




Attorney Docket No. 026083/0173 



AATTAGATCT AAAAGCTAAC TGTAGGACTG AGTCTATTCT AAACTGAAAG CCTGGACATC 600 

TGGAGTACCA GGGGGAGATG ACGTGTTACG GGCTTCCATA AAAGCAGCTG GCTTTGAATG 660 


GAAGGAGCCA AGAGGCCAGC ACAGGAGCGG ATTCGTCGCT TTCACGGCCA TCGAGCCGAA 720 

CCTCTCGCAA GTCCGTGAGC CGTTAAGGAG GCCCCCAGTC CCGACCCTTC GCCCCAAGCC 780 

CCTCGGGGTC CCCGGGCCTG GTACTCCTTG CCACACGGGA GGGGCGCGGA AGCCGGGGCG 840

GAGGAGGAGC CAACCCCGGG CTGGGCTGAG ACCCGCAGAG GAAGACGCTC TAGGGATTTG 900 


CCCTAATCTC AATACAACCT TTGGAGCTAA GCCAGCAATG GTAGAGGGAA GATTCTGCAC 1020 

GTCCCTTCCA GGCGGCCTCC CCGTCACCAC CCCCCCCAAC CCGCCCCGAC CGGAGCTGAG 1080 

AGTAATTCAT ACAAAAGGAC TCGCCCCTGC CTTGGGGAAT CCCAGGGACC GTCGTTAAAC 1140

TCCCACTAAC GTAGAACCCA GAGATCGCTG CGTTCCCGCC CCCTCACCCG CCCGCTCTCG 1200 

TCATCACTGA GGTGGAGAAG AGCATGCGTG AGGCTCCGGT GCCCGTCAGT GGGCAGAGCG 1260 


CACATCGCCC ACAGTCCCCG AGAAGTTGGG GGGAGGGGTC GGCAATTGAA CCGGTGCCTA 1320 

GAGAAGGTGG CGCGGGGTAA ACTGGGAAAG TGATGTCGTG TACTGGCTCC GCCTTTTTCC 1380 

CGAGGGTGGG GGAGAACCGT ATATAAGTGC AGTAGTCGCC GTGAACGTTC TTTTTCGCAA 1440

CGGGTTTGCC GCCAGAACAC AGGTAAGTGC CGTGTGTGGT TCCCGCGGGC CTGGCCTCTT 1500 

TACGGGTTAT GGCCCTTGCG TGCCTTGAAT TACTTCCACG CCCCTGGCTG CAGTACGTGA 1560 


TTCTTGATCC CGAGCTTCGG GTTGGAAGTG GGTGGGAGAG TTCGAGGCCT TGCGCTTAAG 1620 

GAGCCCCTTC GCCTCGTGCT TGAGTTGAGG CCTGGCCTGG GCGCTGGGGC CGCCGCGTGC 1680 

GAATCTGGTG GCACCTTCGC GCCTGTCTCG CTGCTTTCGA TAAGTCTCTA GCCATTTAAA 1740

ATTTTTGATG ACCTGCTGCG ACGCTTTTTT TCTGGCAAGA TAGTCTTGTA AATGCGGGCC 1800 

AAGATCTGCA CACTGGTATT TCGGTTTTTG GGGCCGCGGG CGGCGACGGG GCCCGTGCGT 1860 


CCCAGCGCAC ATGTTCGGCG AGGCGGGGCC TGCGAGCGCG GCCACCGAGA ATCGGACGGG 1920 

GGTAGTCTCA AGCTGGCCGG CCTGCTCTGG TGCCTGGCCT CGCGCCGCCG TGTATCGCCC 1980 

CGCCCTGGGC GGCAAGGCTG GCCCGGTCGG CACCAGTTGC GTGAGCGGAA AGATGGCCGC 2040

TTCCCGGCCC TGCTGCAGGG AGCTCAAAAT GGAGGACGCG GCGCTCGGGA GAGCGGGCGG 2100 

GTGAGTCACC CACACAAAGG AAAAGGGCCT TTCCGTCCTC AGCCGTCGCT TCATGTGACT 2160 


CCACGGAGTA CCGGGCGCCG TCCAGGCACC TCGATTAGTT CTCGAGCTTT TGGAGTACGT 2220 

CGTCTTTAGG TTGGGGGGAG GGGTTTTATG CGATGGAGTT TCCCCACACT GAGTGGGTGG 2280

AGACTGAAGT TAGGCCAGCT TGGCACTTGA TGTAATTCTC CTTGGAATTT GCCCTTTTTG 2340 


AGTTTGGATC TTGGTTCATT CTCAAGCCTC AGACAGTGGT TCAAAGTTTT TTTCTTCCAT 2400 

TTCAGGTGTC GTGAGGAATT GCCCGGGGGA TCCATGGGCT ATCCCTATGA CGTCCCGGAT 2460 

TACGCAGTCA TGGGCAGCAG CCATCATCAT CATCATCACA GCAGCGGCCT GGTGCCGCGC 2520

GGCAGCCATA TGGATCAGAA CAACAGCCTG CCACCTTACG CTCAGGGCTT GGCCTCCCCT 2580 

CAGGGTGCCA TGACTCCCGG AATCCCTATC TTTAGTCCAA TGATGCCTTA TGGCACTGGA 2640 

CTGACCCCAC AGCCTATTCA GAACACCAAT AGTCTGTCTA TTTTGGAAGA GCAACAAAGG 2700

CAGCAGCAGC AACAACAACA GCAGCAGCAG CAGCAGCAGC AGCAGCAACA GCAACAGCAG 2760 

CAGCAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA ACAGGCAGTG 2820

GCAGCTGCAG CCGTTCAGCA GTCAACGTCC CAGCAGGCAA CACAGGGAAC CTCAGGCCAG 2880 

GCACCACAGC TCTTCCACTC ACAGACTCTC ACAACTGCAC CCTTGCCGGG CACCACTCCA 2940 

CTGTATCCCT CCCCCATGAC TCCCATGACC CCCATCACTC CTGCCACGCC AGCTTCGGAG 3000

AGTTCTGGGA TTGTACCGCA GCTGCAAAAT ATTGTATCCA CAGTGAATCT TGGTTGTAAA 3060

CTTGACCTAA AGACCATTGC ACTTCGTGCC CGAAACGCCG AATATAATCC CAAGCGGTTT 3120

GCTGCGGTAA TCATGAGGAT AAGAGAGCCA CGAACCACGG CACTGATTTT CAGTTCTGGG 3180

AAAATGGTGT GCACAGGAGC CAAGAGTGAA GAACAGTCCA GACTGGCAGC AAGAAAATAT 3240 

GCTAGAGTTG TACAGAAGTT GGGTTTTCCA GCTAAGTTCT TGGACTTCAA GATTCAGAAC 3300 

ATGGTGGGGA GCTGTGATGT GAAGTTTCCT ATAAGGTTAG AAGGCCTTGT GCTCACCCAC 3360 

CAACAATTTA GTAGTTATGA GCCAGAGTTA TTTCCTGGTT TAATCTACAG AATGATCAAA 3420

CCCAGAATTG TTCTCCTTAT TTTTGTTTCT GGAAAAGTTG TATTAACAGG TGCTAAAGTC 3480 

AGAGCAGAAA TTTATGAAGC ATTTGAAAAC ATCTACCCTA TTCTAAAGGG ATTCAGGAAG 3540 


ACGACGTAAT GGCTCTCATG TACCCTTGCC TCCCCCACCC CCTTCTTTTT TTTTTTTTAA 3600 

ACAAATCAGT TTGTTTTGGT ACCTTTAAAT GGTGGTGTTG TGAGAAGATG GATGTTGAGT 3660 

TGCAGGGTGT GGCACCAGGT GATGCCCTTC TGTAAGTGCC CCTTCCGGCA TCCCGGATAT 3720

CCTGCAGCCC AACACGGCCG CTCGAGCATG CATCTAGAGA ACGTCACGGC CGCGATCCCC 3780 

CTGTGCCTTC TAGTTGCCAG CCATCTGGTT GTTTGCCCCT CCCCCGTGCC TTCCTTGACC 3840 



CTGGAAGGTG CCACTCCCAC TGTCCTTTCC TAATAAAATG AGGAAATTGC ATCGCATTGT 3900

CTGAGTAGGT GTCATTCTAT TCTGGGGGGT GGGGTGGGGC AGGACAGCAA GGGGGAGGAT 3960

TGGGAAGACA ATAGCAGGCA TGCTGGGGAT GCGGTGGGCT CTATGGGTAC CCAGGTGCTG 4020

AAGAATTGAC CCGGTTCCTC CTGGGCCAGA AAGAAGCAGG CACATCCCCT TCTCTGTGAC 4080

ACACCCTGTC CACGCCCCTG GTTCTTAGTT CCAGCCCCAC TCATAGGACA CTCAACTTGG 4140

AGCGGTCTCT CCCTCCCTCA TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG 4200

AAATTAAAGC AAGAAGGCTA TTAAGTGCAG AGGGAGAGAA AATGCCTCCA ACATGTGAGG 4260

AAGTAATGAT AGAAATCATA GAATTC 4286






(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3263 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1..3263

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:



ATCGATAAGC TGAGATCCGG CTAGAAACTG CTGAGGGCTG GACCGCATCT GGGGACCATC 60

TGTTCTTGGC CCTGAGCGGG GCAGGAACTG CTTACCGCAG ATATCCTGTT TGCCCCAATT 120

CAGCTGTTCC ATCTGTTCTT GGCCCTGAGC GGGGCAGGAA CTGCTTACCA CAGATATCCT 180

GTTTGGCCCA TATTCAGCTG TCTCTCTGTT CCTGACCTTG ATCTGAACTT CTCTATTCTC 240

AGTTATGTAT TTTTCCCATG CCTTGCAAAA TGGCGTTACT TAAGCTAGCT TGCCAAACCT 300

ACGGCTGGGG TCTTTCACGT TTATATCTAT GAGGGGAAGG ACCCAGAGTG GGGAAGCTGG 360

GATCTTGGGA ACACGCTTCT CTACATGGCA TTGTCTGCAC GGTGGAGTCC GGATCTGAGC 420

TTGGCTTGGT TTTTAAAACC AGCCTGGAGT AGAGCAGATG GGTTAAGGTG AGTGACCCCT 480

CAGCCCTGGA CATTCTTAGA TGAGCCCCCT CAGGAGTAGA GAATAATGTT GAGATGAGTT 540

CTGTTGGCTA AAATAATCAA GGCTAGTCTT TATAAAACTG TCTCCTCTTC TCCTAGCTTC 600

GATCCAGAGA GAGACCTGGG CGGAGCTGGT CGCTGCTCAG GAACTCCAGG AAAGGAGAAG 660



CTGAGGTTAC CACGCTGCGA ATGGGTTTAC GGAGATAGCT GGCTTTCCGG GGTGAGTTCT 720

CGTAAACTCC AGAGCAGCGA TAGGCCGTAA TATCGGGGAA AGCACTATAG GGACATGATG 780

TTCCACACGT CACATGGGTC GTCCTATCCG AGCCAGTCGT GCCAAAGGGG CGGTCCCGCT 840

GTGCACACTG GCGCTCCAGG GAGCTCTGCA CTCCGCCCGA AAAGTGCGCT CGGCTCTGCC 900

AGGACGCGGG GCGCGTGACT ATGCGTGGGC TGGAGCAACC GCCTGCTGGG TGCAAACCCT 960

TTGCGCCCGG ACTCGTCCAA CGACTATAAA GAGGGCAGGC TGTCCTCTAA GCGTCACCAC 1020

GACTTCAACG TCCTGAGTAC CTTCTCCTCA CTTACTCCGT AGCTCCAGCT TCACCAGATC 1080



CTCGAGAACG TCTCCCATGG GCTATCCCTA TGACGTCCCG GATTACGCAG TCATGGGCAG 1140

CAGCCATCAT CATCATCATC ACAGCAGCGG CCTGGTGCCG CGCGGCAGCC ATATGGATCA 1200 

GAACAACAGC CTGCCACCTT ACGCTCAGGG CTTGGCCTCC CCTCAGGGTG CCATGACTCC 1260 


CGGAATCCCT ATCTTTAGTC CAATGATGCC TTATGGCACT GGACTGACCC CACAGCCTAT 1320 

TCAGAACACC AATAGTCTGT CTATTTTGGA AGAGCAACAA AGGCAGCAGC AGCAACAACA 1380

ACAGCAGCAG CAGCAGCAGC AGCAGCAGCA ACAGCAACAG CAGCAGCAGC AGCAGCAGCA 1440

GCAGCAGCAG CAGCAGCAGC AGCAGCAGCA GCAACAGGCA GTGGCAGCTG CAGCCGTTCA 1500

GCAGTCAACG TCCCAGCAGG CAACACAGGG AACCTCAGGC CAGGCACCAC AGCTCTTCCA 1560

CTCACAGACT CTCACAACTG CACCCTTGCC GGGCACCACT CCACTGTATC CCTCCCCCAT 1620 

GACTCCCATG ACCCCCATCA CTCCTGCCAC GCCAGCTTCG GAGAGTTCTG GGATTGTACC 1680 

GCAGCTGCAA AATATTGTAT CCACAGTGAA TCTTGGTTGT AAACTTGACC TAAAGACCAT 1740

TGCACTTCGT GCCCGAAACG CCGAATATAA TCCCAAGCGG TTTGCTGCGG TAATCATGAG 1800 

GATAAGAGAG CCACGAACCA CGGCACTGAT TTTCAGTTCT GGGAAAATGG TGTGCACAGG 1860 


AGCCAAGAGT GAAGAACAGT CCAGACTGGC AGCAAGAAAA TATGCTAGAG TTGTACAGAA 1920 

GTTGGGTTTT CCAGCTAAGT TCTTGGACTT CAAGATTCAG AACATGGTGG GGAGCTGTGA 1980 

TGTGAAGTTT CCTATAAGGT TAGAAGGCCT TGTGCTCACC CACCAACAAT TTAGTAGTTA 2040

TGAGCCAGAG TTATTTCCTG GTTTAATCTA CAGAATGATC AAACCCAGAA TTGTTCTCCT 2100 

TATTTTTGTT TCTGGAAAAG TTGTATTAAC AGGTGCTAAA GTCAGAGCAG AAATTTATGA 2160 


AGCATTTGAA AACATCTACC CTATTCTAAA GGGATTCAGG AAGACGACGT AATGGCTCTC 2220 

ATGTACCCTT GCCTCCCCCA CCCCCTTCTT TTTTTTTTTT TAAACAAATC AGTTTGTTTT 2280 

GGTACCTTTA AATGGTGGTG TTGTGAGAAG ATGGATGTTG AGTTGCAGGG TGTGGCACCA 2340



GGTGATGCCC TTCTGTAAGT GCCCCTTCCG GCATCCCGGA ATTCCTGCAG CCCAACGCGG 2400

CCGCTTCGAG GGATCTTTGT GAAGGAACCT TACTTCTGTG GTGTGACATA ATTGGACAAA 2460

CTACCTACAG AGATTTAAAG CTCTAAGGTA AATATAAAAT TTTTAAGTGT ATAATGTGTT 2520

AAACTACTGA TTCTAATTGT TTGTGTATTT TAGATTCCAA CCTATGGAAC TGATGAATGG 2580

GAGCAGTGGT GGAATGCCTT TAATGAGGAA AACCTGTTTT GCTCAGAAGA AATGCCATCT 2640

AGTGATGATG AGGCTACTGC TGACTCTCAA CATTCTACTC CTCCAAAAAA GAAGAGAAAG 2700

GTAGAAGACC CCAAGGACTT TCCTTCAGAA TTGCTAAGTT TTTTGAGTCA TGCTGTGTTT 2760



AGTAATAGAA CTCTTGCTTG CTTTGCTATT TACACCACAA AGGAAAAAGC TGCACTGCTA 2820

TACAAGAAAA TTATGGAAAA ATATTCTGTA ACCTTTATAA GTAGGCATAA CAGTTATAAT 2880

CATAACATAC TGTTTTTTCT TACTCCACAC AGGCATAGAG TGTCTGCTAT TAATAACTAT 2940

GCTCAAAAAT TGTGTACCTT TAGCTTTTTA ATTTGTAAAG GGGTTAATAA GGAATATTTG 3000

ATGTATAGTG CCTTGACTAG AGATCATAAT CAGCCATACC ACATTTGTAG AGGTTTTACT 3060

TGCTTTAAAA AACCTCCCAC ACCTCCCCCT GAACCTGAAA CATAAAATGA ATGCAATTGT 3120

TGTTGTTAAC TTGTTTATTG CAGCTTATAA TGGTTACAAA TAAAGCAATA GCATCACAAA 3180

TTTCACAAAT AAAGCATTTT TTTCACTGCA TTCTAGTTGT GGTTTGTCCA AACTCATCAA 3240

TGTATCTTAT CATGTCTGGA TCC 3263



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 






(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..371 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gly Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Val Met Gly Ser Ser
1 5 10 15

His His His His His His Ser Ser Gly Leu Val Pro Arg Gly Ser His
20 25 30

Met Asp Gln Asn Asn Ser Leu Pro Pro Tyr Ala Gln Gly Leu Ala Ser
35 40 45

Pro Gln Gly Ala Met Thr Pro Gly Ile Pro Ile Phe Ser Pro Met Met
50 55 60

Pro Tyr Gly Thr Gly Leu Thr Pro Gln Pro Ile Gln Asn Thr Asn Ser
65 70 75 80

Leu Ser Ile Leu Glu Glu Gln Gln Arg Gln Gln Gln Gln Gln Gln Gln
85 90 95

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
100 105 110

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Ala
115 120 125

Val Ala Ala Ala Ala Val Gln Gln Ser Thr Ser Gln Gln Ala Thr Gln
130 135 140

Gly Thr Ser Gly Gln Ala Pro Gln Leu Phe His Ser Gln Thr Leu Thr
145 150 155 160

Thr Ala Pro Leu Pro Gly Thr Thr Pro Leu Tyr Pro Ser Pro Met Thr
165 170 175

Pro Met Thr Pro Ile Thr Pro Ala Thr Pro Ala Ser Glu Ser Ser Gly
180 185 190

Ile Val Pro Gln Leu Gln Asn Ile Val Ser Thr Val Asn Leu Gly Cys
195 200 205

Lys Leu Asp Leu Lys Thr Ile Ala Leu Arg Ala Arg Asn Ala Glu Tyr
210 215 220

Asn Pro Lys Arg Phe Ala Ala Val Ile Met Arg Ile Arg Glu Pro Arg
225 230 235 240

Thr Thr Ala Leu Ile Phe Ser Ser Gly Lys Met Val Cys Thr Gly Ala
245 250 255

Lys Ser Glu Glu Gln Ser Arg Leu Ala Ala Arg Lys Tyr Ala Arg Val
260 265 270

Val Gln Lys Leu Gly Phe Pro Ala Lys Phe Leu Asp Phe Lys Ile Gln
275 280 285

Asn Met Val Gly Ser Cys Asp Val Lys Phe Pro Ile Arg Leu Glu Gly
290 295 300

Leu Val Leu Thr His Gln Gln Phe Ser Ser Tyr Glu Pro Glu Leu Phe
305 310 315 320

Pro Gly Leu Ile Tyr Arg Met Ile Lys Pro Arg Ile Val Leu Leu Ile
325 330 335

Phe Val Ser Gly Lys Val Val Leu Thr Gly Ala Lys Val Arg Ala Glu
340 345 350

Ile Tyr Glu Ala Phe Glu Asn Ile Tyr Pro Ile Leu Lys Gly Phe Arg
355 360 365

Lys Thr Thr
370



(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..18



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro 
1 5 10 15

Arg Gly